Subtree power analysis finds optimal species for comparative genomics
Sequence comparison across multiple organisms aids in the detection of
regions under selection. However, resource limitations require a prioritization
of genomes to be sequenced. This prioritization should be grounded in two
considerations: the lineal scope encompassing the biological phenomena of
interest, and the optimal species within that scope for detecting functional
elements. We introduce a statistical framework for optimal species subset
selection, based on maximizing power to detect conserved sites. In a study of
vertebrate species, we show that the optimal species subset is not in general
the most evolutionarily diverged subset. Our results suggest that marsupials
are prime sequencing candidates. Comment: 16 pages, 3 figures, 3 tables
Comment on "Support Vector Machines with Applications"
Comment on "Support Vector Machines with Applications" [math.ST/0612817]. Comment: Published at http://dx.doi.org/10.1214/088342306000000475 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Approximate Inference for Constructing Astronomical Catalogs from Images
We present a new, fully generative model for constructing astronomical
catalogs from optical telescope image sets. Each pixel intensity is treated as
a random variable with parameters that depend on the latent properties of stars
and galaxies. These latent properties are themselves modeled as random. We
compare two procedures for posterior inference. One procedure is based on
Markov chain Monte Carlo (MCMC) while the other is based on variational
inference (VI). The MCMC procedure excels at quantifying uncertainty, while the
VI procedure is 1000 times faster. On a supercomputer, the VI procedure
efficiently uses 665,000 CPU cores to construct an astronomical catalog from 50
terabytes of images in 14.6 minutes, demonstrating the scaling characteristics
necessary to construct catalogs for upcoming astronomical surveys. Comment: accepted to the Annals of Applied Statistics
A Spatially Varying Two-Sample Recombinant Coalescent, With Applications to HIV Escape Response
Statistical evolutionary models provide an important mechanism for describing and understanding the escape response of a viral population under a particular therapy. We present a new hierarchical model that incorporates spatially varying mutation and recombination rates at the nucleotide level. It also maintains separate parameters for treatment and control groups, which allows us to estimate treatment effects explicitly. We use the model to investigate the sequence evolution of HIV populations exposed to a recently developed antisense gene therapy, as well as a more conventional drug therapy. The detection of biologically relevant and plausible signals in both therapy studies demonstrates the effectiveness of the method.
Topic Modeling and Text Analysis for Qualitative Policy Research
This paper contributes to a critical methodological discussion that has direct ramifications for policy studies: how computational methods can be concretely incorporated into existing processes of textual analysis and interpretation without compromising scientific integrity. We focus on the computational method of topic modeling and investigate how it interacts with two larger families of qualitative methods: content and classification methods characterized by interest in words as communication units and discourse and representation methods characterized by interest in the meaning of communicative acts. Based on analysis of recent academic publications that have used topic modeling for textual analysis, our findings show that different mixed-method research designs are appropriate when combining topic modeling with the two groups of methods. Our main concluding argument is that topic modeling enables scholars to apply policy theories and concepts to much larger sets of data. That said, the use of computational methods requires genuine understanding of these techniques to obtain substantially meaningful results. We encourage policy scholars to reflect carefully on methodological issues, and offer a simple heuristic to help identify and address critical points when designing a study using topic modeling. Peer reviewed
STRIDER (Sildenafil TheRapy in dismal prognosis early onset fetal growth restriction): An international consortium of randomised placebo-controlled trials
Background: Severe, early-onset fetal growth restriction due to placental insufficiency is associated with a high risk of perinatal mortality and morbidity with long-lasting sequelae. Placental insufficiency is the result of abnormal formation and function of the placenta with inadequate remodelling of the maternal spiral arteries. There is currently no effective therapy available. Some evidence suggests sildenafil citrate may improve uteroplacental blood flow, fetal growth, and meaningful infant outcomes. The objective of the Sildenafil TheRapy In Dismal prognosis Early onset fetal growth Restriction (STRIDER) collaboration is to evaluate the effectiveness of sildenafil versus placebo in achieving healthy perinatal survival through the conduct of randomised clinical trials and systematic review including individual patient data meta-analysis. Methods: Five national/bi-national multicentre randomised placebo-controlled trials have been launched. Women with a singleton pregnancy between 18 and 30 weeks with severe fetal growth restriction of likely placental origin, and where the likelihood of perinatal death/severe morbidity is estimated to be significant are included. Participants will receive either sildenafil 25 mg or matching placebo tablets orally three times daily from recruitment to 32 weeks gestation. Discussion: The STRIDER trials were conceived and designed through international collaboration. Although the individual trials have different primary outcomes for reasons of sample size and feasibility, all trials will collect a standard set of outcomes including survival without severe neonatal morbidity at time of hospital discharge. This is a summary of all the STRIDER trial protocols and provides an example of a prospectively planned international clinical research collaboration. All five individual trials will contribute to a pre-planned systematic review of the topic including individual patient data meta-analysis
Variational Inference for Deblending Crowded Starfields
In the image data collected by astronomical surveys, stars and galaxies often
overlap. Deblending is the task of distinguishing and characterizing individual
light sources from survey images. We propose StarNet, a fully Bayesian method
to deblend sources in astronomical images of crowded star fields. StarNet
leverages recent advances in variational inference, including amortized
variational distributions and the wake-sleep algorithm. Wake-sleep, which
minimizes forward KL divergence, has significant benefits compared to
traditional variational inference, which minimizes a reverse KL divergence. In
our experiments with SDSS images of the M2 globular cluster, StarNet is
substantially more accurate than two competing methods: Probabilistic Cataloging
(PCAT), a method that uses MCMC for inference, and a software pipeline employed
by SDSS for deblending (DAOPHOT). In addition, StarNet is as much as
times faster than PCAT, exhibiting the scaling characteristics necessary to
perform fully Bayesian inference on modern astronomical surveys. Comment: 37 pages; 20 figures; 3 tables. Submitted to the Journal of the
American Statistical Association
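For reference, the two divergences contrasted in the StarNet abstract can be written out as below. This is a sketch of the standard definitions only. The notation is assumed here and is not taken from the paper: $x$ is the observed image, $z$ the latent source properties, $p(z \mid x)$ the model posterior, and $q_\phi(z \mid x)$ the variational distribution with parameters $\phi$.

```latex
% Reverse KL: the objective minimized by traditional variational inference
% (expectation taken under the approximation q_phi).
\mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z \mid x)\bigr)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{q_\phi(z \mid x)}{p(z \mid x)}\right]

% Forward KL: the objective targeted by the sleep phase of wake-sleep
% (expectation taken under the model posterior p).
\mathrm{KL}\bigl(p(z \mid x) \,\|\, q_\phi(z \mid x)\bigr)
  = \mathbb{E}_{p(z \mid x)}\!\left[\log \frac{p(z \mid x)}{q_\phi(z \mid x)}\right]
```

The two expressions differ only in which distribution the expectation is taken under, which is why they are generally not equal and can favor different approximations.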
Multiple-sequence functional annotation and the generalized hidden Markov phylogeny